Estimation of Latent Ability Distributions Under Essential Unidimensionality



In many large-scale educational assessments (such as the National Assessment of
Educational Progress) it is of interest to compare the distributions of latent abilities of
different subpopulations, and to track these distributions over time to monitor educational
progress. Several researchers have developed methodologies to recover the ability
distribution from item response data. For example, Samejima and Livingston (1979) fit
polynomials to latent densities using the method of moments. Samejima (1984) also fits
$\theta$ densities using the MLE $\hat{\theta}$ by matching two or more moments. Levine (1984) projects the
latent distribution onto a convenient function space and estimates the projections by maximum
likelihood methods. Mislevy (1984) adopts the marginal maximum likelihood method to
recover the distribution of the latent variable from the observed item response patterns.

All of the above-mentioned methods rely on the assumption of local independence
for their validity and are computationally intensive. Junker (1988, 1992), in association
with Paul Holland (ETS) and William Stout (Illinois), developed a simple scheme, based
on the proportion-correct score, for smoothly approximating the ability distribution from
binary responses. His approach is also robust to some violations of local independence:
the methodology works for essentially unidimensional models under essential
independence.

Junker's Approach for Estimating the Latent Trait Distribution 

Let $J$ denote the number of binary items, $\mathbf{X}_J = (X_1, X_2, \ldots, X_J)$ denote the item response
vector, and $P_1(\theta), P_2(\theta), \ldots, P_J(\theta)$ denote the corresponding item characteristic curves
(ICCs) with respect to $\theta$, where $\theta$ denotes the latent trait of interest. Then
$\bar{X}_J = \frac{1}{J}\sum_{j=1}^{J} X_j$ is the average item score, and
$\bar{P}_J(\theta) = \frac{1}{J}\sum_{j=1}^{J} P_j(\theta)$ is the average ICC.

Under the usual assumptions of local independence (LI) and monotonicity (M), or
more generally under Stout's (1987, 1990) formulation of essential independence (EI) and
locally asymptotic discrimination (LAD), it can be shown that (Junker, 1988)

$$\hat{\theta}_J(\mathbf{X}_J) = \bar{P}_J^{-1}(\bar{X}_J)$$

is a plausible point estimator of $\theta$. That is, $\hat{\theta}_J(\mathbf{X}_J)$ is a consistent estimator of $\theta$ under
either set of assumptions. The distribution of $\hat{\theta}_J(\mathbf{X}_J)$ is then given by

$$F_J(t) = P[\hat{\theta}_J(\mathbf{X}_J) \le t] \tag{1}$$

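A natural estimator of the $\theta$ distribution given in equation (1) is the "empirical"
distribution of the $\hat{\theta}_J$'s obtained by administering the test $\mathbf{X}_J$ to $N$ examinees, resulting in $N$
response vectors $\mathbf{X}_{1J}, \mathbf{X}_{2J}, \ldots, \mathbf{X}_{NJ}$ and the corresponding $\theta$ estimates
$\hat{\theta}_J(\mathbf{X}_{1J}), \hat{\theta}_J(\mathbf{X}_{2J}), \ldots, \hat{\theta}_J(\mathbf{X}_{NJ})$. The empirical distribution of the $\hat{\theta}_J$'s is given by

$$\hat{F}_N(t) = \frac{1}{N}\sum_{n=1}^{N} 1_{\{\hat{\theta}_J(\mathbf{X}_{nJ}) \le t\}} = \{\text{fraction of } \hat{\theta}_J(\mathbf{X}_{nJ})\text{'s} \le t\} \tag{2}$$

where the "indicator function" $1_S$ takes the value 1 if $S$ is true and 0 if $S$ is false.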

It has been shown that, if the distribution function $F(t)$ is continuous, the empirical
distribution function $\hat{F}_N(t)$ converges in probability to $F(t)$ at each $t$ as both $J \to \infty$ and
$N \to \infty$ (Junker, 1988, 1992).
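
The point estimator (the average ICC inverted at the average item score) and the empirical distribution in equation (2) can be sketched in a few lines of Python. This is a minimal illustration rather than Junker's implementation: the function names, the logistic 3PL form of the ICCs, the bisection search, and the search interval [-6, 6] are all assumptions made for the example.

```python
import math

def icc_3pl(theta, a, b, c):
    """Logistic 3PL item characteristic curve (assumed form, no scaling constant)."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))

def average_icc(theta, items):
    """Average ICC: mean of the item ICCs at theta; items is a list of (a, b, c)."""
    return sum(icc_3pl(theta, a, b, c) for a, b, c in items) / len(items)

def invert_average_icc(xbar, items, lo=-6.0, hi=6.0, tol=1e-8):
    """Solve average_icc(theta) = xbar by bisection on [lo, hi].

    Scores outside the range attained on [lo, hi] are clamped to an endpoint.
    The clamping is a simplification of this sketch, not part of the source method.
    """
    if xbar <= average_icc(lo, items):
        return lo
    if xbar >= average_icc(hi, items):
        return hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if average_icc(mid, items) < xbar:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def edf(theta_hats, t):
    """Empirical distribution function: fraction of estimates <= t."""
    return sum(1 for th in theta_hats if th <= t) / len(theta_hats)

# Usage: four identical hypothetical items and three observed average scores.
items = [(1.0, 0.0, 0.0)] * 4
theta_hats = [invert_average_icc(x, items) for x in (0.25, 0.5, 0.75)]
print(edf(theta_hats, 0.0))
```

Bisection is used here only because the average ICC is monotone in $\theta$ when every item has positive discrimination; any one-dimensional root finder would serve.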


Practical Limitations 

In applications, if $\bar{P}_J$ has a lower asymptote $c$ and $\bar{X}_J < c$, then $\bar{P}_J^{-1}(\bar{X}_J)$ is set to $-\infty$.
Although the probability of this happening decreases to zero as $J$ tends to infinity, it still
happens with some frequency when $J$ is small. Therefore, we must be concerned with
recovering the $\theta$ distribution whenever the $\bar{X}_J$'s fall below the lower asymptote $c$ (and
similarly above an upper asymptote $d < 1$). Two adjustments were made to overcome this problem.



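(i) Replace the point estimator $\hat{\theta}_J$ with

$$\hat{\theta}_J^{*} = \bar{P}_J^{-1}\!\left(\frac{J\bar{X}_J + 1}{J + 2}\right)$$

($\hat{\theta}_J^{*}$ also converges to $\theta$ and is bounded if the asymptotes of $\bar{P}_J$ are 0 and 1).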

This first adjustment handles proportion-correct scores of 0 and 1 when the lower
asymptote is 0 and the upper asymptote is 1. As a result of this adjustment the variance of
the estimated $\theta$ distribution shrinks slightly, but the shrinkage diminishes as the test length increases.

(ii) The numerical inverter of the function $\bar{P}_J$ is written (on the computer) such that it
finds the root of a linear extrapolation of $\bar{P}_J(t) = \bar{X}_J$ when $\bar{X}_J$ lies outside the asymptotes of
$\bar{P}_J$.

The second adjustment handles the cases that the first adjustment cannot.
For example, if the guessing parameter $c > 0$, then for cases where $\bar{X}_J < c$ the numerical
inverter approximates and assigns a finite value for $\hat{\theta}_J$. This adjustment is also needed less
frequently as the test length grows.
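
The first adjustment can be checked numerically. The sketch below assumes that the adjusted score $(J\bar{X}_J + 1)/(J + 2)$ takes the place of the average item score inside the inverse of the average ICC; the function name is hypothetical. It shows that the adjusted score stays strictly between 0 and 1 even for all-wrong and all-right response patterns, and that it pulls every score slightly toward 1/2.

```python
def adjusted_score(xbar, n_items):
    """Adjusted proportion-correct score: (J * xbar + 1) / (J + 2)."""
    return (n_items * xbar + 1.0) / (n_items + 2.0)

# The extremes xbar = 0 and xbar = 1 map strictly inside (0, 1);
# the gap to the boundary narrows as the test gets longer.
for n_items in (10, 20, 60):
    low = adjusted_score(0.0, n_items)
    high = adjusted_score(1.0, n_items)
    print(n_items, round(low, 4), round(high, 4))
```

Because the adjusted score minus 1/2 equals $J(\bar{X}_J - 1/2)/(J + 2)$, the pull toward the center is by the factor $J/(J+2)$, which is consistent with the slight variance shrinkage noted above.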
Kernel Smoothing



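The basic estimator presented in equation (2) can also be written as

$$\hat{F}_N(t) = \sum_{j=0}^{J} K\!\left(\frac{t - \bar{P}_J^{-1}(j/J)}{h}\right) \hat{P}_N[\bar{X}_J = j/J] \tag{3}$$

where $\hat{P}_N[\cdot]$ is the estimator of the discrete distribution of $\bar{X}_J$ based on $N$ observations,
$K(u)$ is constant except for a jump from 0 to 1 at $u = 0$, and $h$ is any fixed positive number.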

In cases where the distribution is truly continuous, the performance of $\hat{F}_N$ in
equation (2) can be improved by replacing the discrete function $K(u)$ with a continuous
distribution function $\tilde{K}(u)$ which increases from 0 to 1 as $u$ ranges from $-\infty$ to $\infty$. Let

$$\tilde{F}_N(t) = \sum_{j=0}^{J} \tilde{K}\!\left(\frac{t - \bar{P}_J^{-1}(j/J)}{h}\right) \hat{P}_N[\bar{X}_J = j/J] \tag{4}$$

denote the smoothed estimator obtained by replacing $K$ with $\tilde{K}$. In equation (4), $h$ (the window
width) is a parameter of the smoothing function. If $h$ is large the smoothing function
increases slowly, and if $h$ is near zero the smoothing function is steeper.

A practical question is: given $N$ and $J$, what is a reasonable choice for $h$ so as to get the
best possible estimator of the $\theta$ distribution? The formula for $h$ (Silverman, 1986, pp. 45-48; Reiss, 1981) is
given by

$$h = C N^{-1/5} (\mathrm{var}\,\theta)^{1/2} \tag{5}$$

where $C$ is an unknown constant that may be determined experimentally.
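
The window-width formula is a one-line computation once the variance of the $\theta$ estimates is available. The sketch below is illustrative only: the function name is hypothetical, the exponent on $N$ is exposed as a parameter and defaults to -1/5 (the value in Silverman's rule of thumb, assumed here), and the variance is estimated directly from the $\theta$ estimates with the population formula, which is one possible choice among several.

```python
import statistics

def bandwidth(theta_hats, c=1.0, exponent=-0.2):
    """Window width h = C * N**exponent * sqrt(var), with var estimated from the data.

    The default exponent (-1/5) and the use of the population variance are
    assumptions of this sketch.
    """
    n = len(theta_hats)
    variance = statistics.pvariance(theta_hats)
    return c * n ** exponent * variance ** 0.5

# Usage: 32 hypothetical estimates with unit variance.
theta_hats = [-1.0, 1.0] * 16
for c in (1.0, 1 / 2, 1 / 3, 1 / 4):
    print(c, bandwidth(theta_hats, c))
```

Smaller values of $C$ give a narrower window and therefore less smoothing for the same sample.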



The smoothing kernel $\tilde{K}(t)$ is given by

$$\tilde{K}(t) = \begin{cases} 0, & t < -1 \\ \tfrac{1}{4}(3t - t^3 + 2), & |t| \le 1 \\ 1, & t > 1 \end{cases}$$

Other smoothing kernels could be chosen besides the $\tilde{K}(t)$ shown here. The advantages
of $\tilde{K}(t)$ are: (a) it is easy and fast to compute, and (b) it is conservative about the tails of the
estimated distribution.
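
The kernel and the smoothed estimator can be sketched as follows. The function names are hypothetical, and the estimator is written as an average over examinees, which is equivalent to weighting each possible score by its observed relative frequency. Passing the step kernel instead of the smooth one recovers the unsmoothed empirical distribution, which makes the relationship between the two estimators easy to see.

```python
def smooth_kernel(u):
    """Cubic kernel distribution function: 0 below -1, 1 above 1, smooth in between."""
    if u < -1.0:
        return 0.0
    if u > 1.0:
        return 1.0
    return 0.25 * (3.0 * u - u ** 3 + 2.0)

def step_kernel(u):
    """Discrete kernel: jumps from 0 to 1 at u = 0."""
    return 1.0 if u >= 0.0 else 0.0

def kernel_estimate(t, theta_hats, h, kernel=smooth_kernel):
    """Estimated distribution function at t: mean of kernel((t - theta_hat) / h)."""
    return sum(kernel((t - th) / h) for th in theta_hats) / len(theta_hats)

# Usage: compare the smoothed and unsmoothed estimates at t = 0.
theta_hats = [-1.0, 0.0, 1.0, 2.0]
print(kernel_estimate(0.0, theta_hats, h=0.5))
print(kernel_estimate(0.0, theta_hats, h=0.5, kernel=step_kernel))
```

The derivative of the cubic kernel on [-1, 1] is $\tfrac{3}{4}(1 - t^2)$, so each estimate contributes a density with bounded support, which is one way to read the remark that the kernel is conservative about the tails.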

In Junker's (1988, 1992) study, $C = 1/3$ was chosen and $\mathrm{var}(\theta)$ was approximated by
the interquartile range of a uniform distribution. Junker investigated the performance of
both the discrete empirical distribution function (EDF) in equation (3) and the kernel
(smoothed) distribution estimate (KDE) in equation (4) in a Monte Carlo experiment. The
following factors were varied: test length (10, 30, 60, 100), ability distribution
(normal, bimodal, discontinuous), ICC for generation (Rasch, 3PL), and ICC for recovery
(Rasch, 3PL, and 3PL with noise introduced). Samples of 5000 examinees were
simulated in all cases.

Junker's results showed that both KDE and EDF were able to recover the $\theta$
distribution very well in all cases, with KDE performing better than EDF, especially for
short tests. As the test length increased, the RMS distance measures decreased, and the
smoothness of the distribution and density plots improved.

The goals of the present study were to refine the smoothing parameter $h$ and
replicate the results obtained by Junker (1992); to investigate the performance of EDF and
KDE in estimating the ability distribution under essential unidimensionality as opposed to
strict unidimensionality; and to illustrate the methodology on a real data set.



Refinement of the Smoothing Parameter h 

The aim was to find the best method for estimating the variance of $\theta$
and the best value for $C$ in equation (5). Three methods of estimating the variance were
considered:

1. the interquartile range of the uniform distribution, as before (V1);

2. the interquartile range estimated from the frequency distribution of the
observed data (V2);

3. the direct estimation of the variance from the frequency distribution of the
observed data (V3).

Four different values for $C$ were considered: $C$ = 1, 1/2, 1/3, 1/4.

To find the best combination of $C$ and variance estimation method, a 3 x 4 design
was used.

Table A

Different combinations of C and variance for computing h

        C1 = 1    C2 = 1/2    C3 = 1/3    C4 = 1/4
V1      C1V1*     C2V1*       C3V1*       C4V1
V2      C1V2*     C2V2*       C3V2        C4V2
V3      C1V3*     C2V3        C3V3        C4V3

For each cell of Table A we studied the performance of the smoothed estimator
KDE and the unsmoothed estimator EDF by varying the following factors: two ability
distributions to generate the $\theta$'s (normal, bimodal); two types of ICC for generation (1PL, 3PL); two
types of ICC to recover the $\theta$ distribution (1PL, and noisy 3PL, in which the item parameters were
deliberately contaminated with noise); two test lengths (20, 60); and one examinee sample
size (1000). For each combination of these factors, the performance of KDE and
EDF was studied for those combinations of $C$ and $V$ marked with * in Table A (the other
cells did not produce promising results in initial simulations and therefore were not
studied further). These results are shown in Tables 1 to 4 and are based on 100
replications. In Tables 1 to 4, RMS EDF denotes the root mean square distance
between the estimated and the true distributions for the EDF estimator, computed over 500 points
and averaged over replications. Similarly, RMS KDE denotes the root mean square distance for
the smoothed estimator. STD $\hat{\theta}$ denotes the estimated standard deviation of the $\hat{\theta}$'s
averaged over replications.
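
The RMS distance for a single replication can be sketched as below. The grid of 500 points matches the text, but the grid range of [-4, 4], the equal spacing, and the function names are assumptions of this example; the source does not say where the 500 points were placed.

```python
import math

def normal_cdf(t):
    """Standard normal distribution function, used here as an example 'true' distribution."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))

def rms_distance(est_cdf, true_cdf, lo=-4.0, hi=4.0, n_points=500):
    """Root mean square difference between two distribution functions on an even grid."""
    step = (hi - lo) / (n_points - 1)
    total = 0.0
    for k in range(n_points):
        t = lo + k * step
        total += (est_cdf(t) - true_cdf(t)) ** 2
    return math.sqrt(total / n_points)

# Usage: an "estimate" that is uniformly 0.01 too high has an RMS distance of about 0.01.
print(rms_distance(lambda t: normal_cdf(t) + 0.01, normal_cdf))
```

Averaging this quantity over the 100 replications would give one entry of the RMS columns in the tables.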

Tables 1 and 2 show the results for the normal distribution and Tables 3 and 4 show
the results for the bimodal distribution. In each of the tables, the variation in the RMS EDF column
across conditions is purely random and is not affected by the $C$ and $V$ combination;
it shows how much these values fluctuate over replications. The column under
RMS KDE shows the real differences in RMS for the different conditions. For example, in
Table 1 the RMS KDE for C1V3 is much smaller than for C3V1 with 20 items.

Figures 1-4 show the distribution and density plots for a sample of runs in Tables
1-4. Each figure contains two panels. The left panel is the P-P plot, where
the X-axis denotes the true distribution and the Y-axis denotes the estimated
distribution. The step function denotes the EDF estimator and the smooth curve denotes
the KDE estimator. The closer an estimator lies to the solid diagonal line, the better
the estimator. The right panel compares the density derived from the KDE estimator with
the true density.

The choice of values for $V$ and $C$ was based on the distance
measures in Tables 1-4 and the plots of the estimated distribution function in Figures 1-4.
Based on these results the C1V3 combination was chosen. That is, the direct variance
estimation method with the constant $C = 1$ produced the best smoothing parameter $h$.

With the new values for $C$ and $V$ we then repeated the study done by Junker (1992)
to see whether we would obtain similar results. These results are shown in Tables 5 and 6. Table 5 shows
results for the normal distribution across three types of ICC generation and recovery, and
Table 6 shows the same for the bimodal distribution. A sample of P-P plots and
density plots is shown in Figures 5 and 6.

In Table 5, a comparison of results across the three types of ICCs shows that the distance
measures increase slightly as the model gets more complex. That is, with guessing and
noise present the error increases slightly. We also studied the location on the $\theta$-scale where
the maximum discrepancy occurs between the true and the estimated distributions. Across
the models this location shifts to the left. In other words, as guessing and noise are
introduced, there is more instability in estimation at lower ability levels, as can be
expected. The same findings were observed for the bimodal distribution in Table 6. These
results indicate that the RMS values and P-P plots are similar to or slightly better than those
obtained by Junker (1992).

Essentially Unidimensional Study 

The main goal of this study was to see whether the results observed so far for data
generated with strictly unidimensional models would hold up for data generated with
essentially unidimensional models, where there is one dominant dimension influencing
responses to all items and several minor dimensions each influencing responses to a few items.
The data generated here resembled a paragraph comprehension test in which the dominant
ability influenced all items and the items of each paragraph were influenced by an additional
ability unique to that paragraph. In all there were $1 + r$ abilities, where $r$ denotes the number
of paragraphs. Each item is therefore influenced by two abilities: the dominant ability $\theta$
and one of the minor abilities $\theta_1$ through $\theta_r$. The abilities were generated from a bivariate
normal distribution with zero correlation between the abilities. The item parameters were
generated as follows.

a 2 ~N(^,^a) 
$b_i \sim N(0, 1),\ i = 1, 2$

where $\xi$ denotes the strength of the minor ability in relation to the major ability.

The two-dimensional 3PL model was used to generate item responses, given by

$$P_i(\boldsymbol{\theta}) = c_i + \frac{1 - c_i}{1 + \exp\{-\mathbf{a}_i'(\boldsymbol{\theta} - \mathbf{b}_i)\}},$$

where $\mathbf{a}_i$ is the discrimination vector, $\boldsymbol{\theta}$ is the ability vector, and $\mathbf{b}_i$ is the difficulty vector of
item $i$. Three test lengths (20, 40, 60) and two $\xi$ values (0.2 and 0.4) were used
in the simulations. For each test length, 2 items per paragraph and 5 items per paragraph were
considered.
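
The paragraph structure of the generated data can be sketched as follows. This is an illustration under stated assumptions, not the authors' generator: the compensatory logistic form with exponent $a_1(\theta - b_1) + a_2(\theta_p - b_2)$ and no scaling constant, the unit variances of the abilities, the example item parameters, and all function names are hypothetical choices for the example.

```python
import math
import random

def prob_2d_3pl(theta, theta_minor, a1, a2, b1, b2, c):
    """Assumed compensatory two-dimensional 3PL response probability."""
    z = a1 * (theta - b1) + a2 * (theta_minor - b2)
    return c + (1.0 - c) / (1.0 + math.exp(-z))

def simulate_examinee(items, paragraph_of, n_paragraphs, rng):
    """Draw one dominant and n_paragraphs minor abilities, then binary item responses.

    items: list of (a1, a2, b1, b2, c); paragraph_of[i] is the paragraph index of item i.
    Abilities are independent standard normals (unit variance is assumed here).
    """
    theta = rng.gauss(0.0, 1.0)
    minors = [rng.gauss(0.0, 1.0) for _ in range(n_paragraphs)]
    responses = []
    for (a1, a2, b1, b2, c), p in zip(items, paragraph_of):
        prob = prob_2d_3pl(theta, minors[p], a1, a2, b1, b2, c)
        responses.append(1 if rng.random() < prob else 0)
    return responses

# Usage: four hypothetical items, two per paragraph, with a weak minor dimension.
rng = random.Random(0)
items = [(1.0, 0.2, 0.0, 0.0, 0.2)] * 4
paragraph_of = [0, 0, 1, 1]
print(simulate_examinee(items, paragraph_of, 2, rng))
```

Each item loads on the dominant ability and on exactly one paragraph ability, so items within a paragraph share a source of dependence that a strictly unidimensional model would not capture.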

Two ways of estimating (recovering) the ICCs were considered. First, the two sets of
item parameters used for generating the item responses were manipulated as follows in
order to obtain one-dimensional ICCs:

$$a(i) = \frac{a_1(i)}{\sqrt{1 + a_2(i)\,a_2(i)}}$$

$$b(i) = b_1(i) + \frac{a_2(i)}{a_1(i)}\, b_2(i),$$

which were then used to obtain the EDF and the KDE estimators. This definition of $a(i)$
and $b(i)$ is exactly what we would get if we averaged over the nuisance (paragraph) traits
in the two-dimensional compensatory normal ogive model. This is a good approximation to
the result of averaging over the nuisance traits in the logistic model we are using.
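
The claim about the normal ogive model can be verified numerically. The sketch below assumes a compensatory normal ogive with argument $a_1(\theta - b_1) + a_2(\theta_2 - b_2)$ and a standard normal nuisance trait $\theta_2$; the function names and the trapezoid integration grid are choices made for the example. It averages the two-dimensional curve over $\theta_2$ and compares the result with the one-dimensional ogive built from $a(i)$ and $b(i)$.

```python
import math

def normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def collapse_item(a1, a2, b1, b2):
    """One-dimensional parameters: a = a1 / sqrt(1 + a2^2), b = b1 + (a2 / a1) * b2."""
    a = a1 / math.sqrt(1.0 + a2 * a2)
    b = b1 + (a2 / a1) * b2
    return a, b

def averaged_ogive(theta, a1, a2, b1, b2, half_width=8.0, n=4001):
    """Average the two-dimensional ogive over theta2 ~ N(0, 1) by the trapezoid rule."""
    step = 2.0 * half_width / (n - 1)
    total = 0.0
    for k in range(n):
        t2 = -half_width + k * step
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * normal_cdf(a1 * (theta - b1) + a2 * (t2 - b2)) * normal_pdf(t2)
    return total * step

# Usage: the two columns printed below should agree closely for each theta.
a1, a2, b1, b2 = 1.0, 0.4, 0.5, -0.3
a, b = collapse_item(a1, a2, b1, b2)
for theta in (-1.0, 0.0, 1.5):
    print(theta, averaged_ogive(theta, a1, a2, b1, b2), normal_cdf(a * (theta - b)))
```

The agreement follows from the identity $E[\Phi(\alpha + a_2 Z)] = \Phi(\alpha / \sqrt{1 + a_2^2})$ for standard normal $Z$, with $\alpha = a_1(\theta - b_1) - a_2 b_2$.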

The results of this study are reported in Table 7 and the plots are shown in
Figures 7-10. The RMS values are within the expected range and the plots resemble those in the
strictly unidimensional case. Hence one can conclude, based on these simulations, that the
KDE estimator is an acceptable method for estimating the underlying distributions,
provided the ICCs can be well estimated.

Secondly, in order to investigate a more practical approach for estimating the ICCs, 
item parameters were obtained by using the computer program BILOG. These ICCs were 
then used to obtain the EDF and the KDE estimators. The results of this study are shown 
in Figures 11-16. 

Figures 11-13 display distribution and density plots for the case where $\xi = 0.2$ and
Figures 14-16 display the plots for the case where $\xi = 0.4$. That is, the influence of the
minor ability in relation to the major ability is stronger in the latter case. As can be seen from
the figures, the distributions and densities are recovered smoothly in both cases. As the test
length increases, the curves look smoother.
Real Data Study



In order to investigate the performance of the KDE estimator on a real data set, the ACT
reading test was used. The reading test consists of 40 items and 4 paragraphs, where each
paragraph is followed by 10 items. There were 5000 examinees in this data set. As a first
step, DIMTEST (Stout, 1987; Nandakumar & Stout, 1993) was used to investigate whether the
reading test was essentially unidimensional. We found that the last 10 items were causing
multidimensionality. Upon further investigation we found that the first 30 items tapped the
literature content area while the last 10 items tapped the psychology content area. Moreover,
since these were the last 10 items, speededness could also have contributed to the
multidimensionality. When these items were removed, the rest of the items were found to
be essentially unidimensional by DIMTEST (Nandakumar, in press). The item parameters
of this data set were estimated using BILOG, and the ability distributions were estimated
and compared using KDE estimators for three subpopulations: students who attained the
grade A in high school (N = 1574), students who attained the grade B (N = 2144), and
students who attained the grade C (N = 915).

The comparison of distributions and densities for the three subpopulations is
shown in Figure 17. From these plots it can be seen that the KDE estimator
estimates the distributions smoothly, while the densities could be further improved. Notice that the
estimated distribution for the A students is higher than those for the B and C students,
and the estimated distribution for the B students is higher than that for the C students.
This corresponds to our expectations, which supports the idea that the latent
distribution estimator is performing well.
Summary and Discussion



In summary, the smoothing parameter $h$ was refined to obtain optimal values for
the constant $C$ and for the variance estimation in equation (5), so that the KDE estimates
of the distributions are smooth. Secondly, the performance of the EDF and KDE estimators was
investigated for data generated under essentially unidimensional models. Two methods
of estimating ICCs were considered for this purpose: (a) the two-dimensional item
parameters used for generating the data were manipulated to resemble one-dimensional
item parameters, and (b) the ICCs were estimated by BILOG. In both cases the RMS values and the
distribution and density plots indicated that these estimators are acceptable methods for
estimating underlying ability distributions. Thirdly, the performance of the KDE estimator
was illustrated on the ACT reading test to compare the distributions of three
subpopulations. These results further confirmed that the KDE estimator performs well
in estimating latent distributions.

The KDE and EDF estimators investigated in this paper are simple, fast, and
easy-to-compute methods for recovering latent distributions. These estimators work for a general
class of ICCs and are robust to violations of the local independence and strict
unidimensionality assumptions. The results of this paper illustrate the promise of these
methods for the future.
Table 1: Study: Smoothing Parameter, N = 1000
$\theta$-Distribution: Normal
ICC Generation: 1PL
ICC Recovery: 1PL

Combination                20 Items                              60 Items
of C and V     RMS EDF   RMS KDE   STD $\hat{\theta}$     RMS EDF   RMS KDE   STD $\hat{\theta}$
C1V1           .0205     .0105     1.54           .0118     .0095     1.61
C1V2           .0217     .0045     .90            .0259     .0245     .94
C1V3           .0219     .0071     .98            .0192     .0184     .99
C2V1           .0218     .0075     1.55           .0131     .0118     1.61
C2V2           .0215     .0116     .91            .0194     .0174     1.00
C3V1           .0282     .0207     1.55           .0161     .0153     1.61


Table 2: Study: Smoothing Parameter, N = 1000
$\theta$-Distribution: Normal
ICC Generation: 3PL
ICC Recovery: Noisy 3PL

Combination                20 Items                              60 Items
of C and V     RMS EDF   RMS KDE   STD $\hat{\theta}$     RMS EDF   RMS KDE   STD $\hat{\theta}$
C1V1           .0339     .0329     1.70           .0155     .0158     1.80
C1V2           .0407     .0374     1.26           .0197     .0170     1.05
C1V3           .0362     .0318     1.40           .0122     .0077     1.02
C2V1           .0466     .0382     1.70           .0205     .0180     1.80
C2V2           .0430     .0375     1.26           .0136     .0100*    .90
C3V1           .0385     .0325     1.70           .0128     .0091     1.80


Table 3: Study: Smoothing Parameter, N = 1000
$\theta$-Distribution: Bimodal
ICC Generation: 1PL
ICC Recovery: 1PL

Combination                20 Items                              60 Items
of C and V     RMS EDF   RMS KDE   STD $\hat{\theta}$     RMS EDF   RMS KDE   STD $\hat{\theta}$
C1V1           .0274     .0245     1.55           .0239     .0232     1.61
C1V2           .0274     .0239     1.89           .0152     .0158     2.11
C1V3           .0286     .0263     1.54           .0134     .0130     1.68
C2V1           .0231     .0189     1.55           .0172     .0162     1.61
C2V2           .0221     .0183     2.07           .0111     .0100     2.18
C3V1           .0277     .0251     1.55           .0096     .0088     1.61


Table 4: Study: Smoothing Parameter, N = 1000
$\theta$-Distribution: Bimodal
ICC Generation: 3PL
ICC Recovery: Noisy 3PL

Combination                20 Items                              60 Items
of C and V     RMS EDF   RMS KDE   STD $\hat{\theta}$     RMS EDF   RMS KDE   STD $\hat{\theta}$
C1V1           .0285     .0245     1.70           .0265     .0257     1.80
C1V2           .0383     .0365     2.17           .0141     .0134     2.10
C1V3           .0308     .0266     2.28           .0164     .0152     1.62
C2V1           .0324     .0295     1.70           .0180     .0169     1.80
C2V2           .0302     .0241     2.17           .0111     .0088     2.14
C3V1           .0264     .0215     1.70           .0150     .0139     1.80

Table 5: Replication Study with C = 1, V = 3, N = 1000
$\theta$-Distribution: Normal
Column groups: ICC Generation - ICC Recovery

Number             1PL - 1PL                    3PL - 3PL                 3PL - Noisy 3PL
of Items   RMS EDF  RMS KDE  STD $\hat{\theta}$   RMS EDF  RMS KDE  STD $\hat{\theta}$   RMS EDF  RMS KDE  STD $\hat{\theta}$
20         .0229    .0107    .96          .0304    .0212    1.06         .0382    .0336    1.35
40         .0164    .0115    .97          .0197    .0157    1.04         .0334    .0319    1.08
60         .0147    .0119    .98          .0160    .0134    1.03         .0163    .0135    1.02



Table 6: Replication Study with C = 1, V = 3, N = 1000
$\theta$-Distribution: Bimodal
Column groups: ICC Generation - ICC Recovery

Number             1PL - 1PL                    3PL - 3PL                 3PL - Noisy 3PL
of Items   RMS EDF  RMS KDE  STD $\hat{\theta}$   RMS EDF  RMS KDE  STD $\hat{\theta}$   RMS EDF  RMS KDE  STD $\hat{\theta}$
20         .0284    .0253    1.60         .0325    .0279    1.60         .0320    .0282    2.26
40         .0188    .0174    1.66         .0209    .0192    1.70         .0252    .0242    1.74
60         .0173    .0165    1.70         .0183    .0173    1.73         .0192    .0182    1.70




Table 7: Essential Unidimensionality: Paragraph Comprehension
ICC Generation: Normal, two-dim 3PL
ICC Recovery: Normal, one-dim 3PL

                  $\xi$ = 0.2, $d_E$ = 1                     $\xi$ = 0.2, $d_E$ = 1
Number            2 items/paragraph                      5 items/paragraph
of Items   RMS EDF   RMS KDE   STD $\hat{\theta}$     RMS EDF   RMS KDE       STD $\hat{\theta}$
20         .0455     .0379     1.03           .0276     [illegible]   1.00
40         .0292     .0245     1.06           .0256     .0238         1.12
60         .0150     .0109     1.01           .0131     .0094         1.02



Figure 17: Reading Data Study
ICCs Estimated by BILOG